1.
Physiol Rep ; 12(8): e16015, 2024 Apr.
Article in English | MEDLINE | ID: mdl-38653581

ABSTRACT

Adaptation of humans to challenging environmental conditions, such as extreme temperature, malnutrition, or hypoxia, is an interesting phenomenon for both basic and applied research. Identification of the genetic factors contributing to human adaptation to these conditions enhances our understanding of the underlying molecular and physiological mechanisms. In our study, we analyzed the exomes of 22 high altitude mountaineers to uncover genetic variants contributing to hypoxic adaptation. To our surprise, we identified two putative loss-of-function variants, rs1385101139 in RTEL1 and rs1002726737 in COL6A1 in two extremely high altitude (personal record of more than 8500 m) professional climbers. Both variants can be interpreted as pathogenic according to medical geneticists' guidelines, and are linked to inherited conditions involving respiratory failure (late-onset pulmonary fibrosis and severe Ullrich muscular dystrophy for rs1385101139 and rs1002726737, respectively). Our results suggest that a loss of gene function may act as an important factor of human adaptation, which is corroborated by previous reports in other human subjects.


Subject(s)
Altitude , Collagen Type VI , Respiratory Insufficiency , Humans , Collagen Type VI/genetics , Male , Respiratory Insufficiency/genetics , Adult , Mountaineering , Female , Exome Sequencing/methods , Middle Aged , Altitude Sickness/genetics
2.
Brief Bioinform ; 25(2)2024 Jan 22.
Article in English | MEDLINE | ID: mdl-38271481

ABSTRACT

Next-generation sequencing (NGS) has revolutionized the field of rare disease diagnostics. Whole exome and whole genome sequencing are now routinely used for diagnostic purposes; however, the overall diagnosis rate remains lower than expected. In this work, we review current approaches used for calling and interpretation of germline genetic variants in the human genome, and discuss the most important challenges that persist in the bioinformatic analysis of NGS data in medical genetics. We describe and attempt to quantitatively assess the remaining problems, such as the quality of the reference genome sequence, reproducible coverage biases, or variant calling accuracy in complex regions of the genome. We also discuss the prospects of switching to the complete human genome assembly or the human pan-genome and important caveats associated with such a switch. We touch on arguably the hardest problem of NGS data analysis for medical genomics, namely, the annotation of genetic variants and their subsequent interpretation. We highlight the most challenging aspects of annotation and prioritization of both coding and non-coding variants. Finally, we demonstrate the persistent prevalence of pathogenic variants in the coding genome, and outline research directions that may enhance the efficiency of NGS-based disease diagnostics.


Subject(s)
Computational Biology , Rare Diseases , Humans , Rare Diseases/diagnosis , Rare Diseases/genetics , Genomics , Genome, Human , Germ Cells , High-Throughput Nucleotide Sequencing
3.
Int J Mol Sci ; 24(24)2023 Dec 12.
Article in English | MEDLINE | ID: mdl-38139235

ABSTRACT

Type 2 diabetes mellitus (T2D) is a chronic metabolic disease characterized by insulin resistance and β-cell dysfunction and leading to many micro- and macrovascular complications. In this study we analyzed the circulating miRNA expression profiles in plasma samples from 44 patients with T2D and 22 healthy individuals using next generation sequencing and detected 229 differentially expressed miRNAs. Increased levels of miR-5588-5p, miR-125b-2-3p, and miR-1284 and a reduced level of miR-496 in T2D patients were verified. We also compared the expression landscapes in the same group of patients depending on body mass index and identified differential expression of miR-144-3p and miR-99a-5p in obese individuals. Identification and functional analysis of putative target genes was performed for miR-5588-5p, miR-125b-2-3p, miR-1284, and miR-496, showing chromatin modifying enzymes and apoptotic genes being among the significantly enriched pathways.


Subject(s)
Diabetes Mellitus, Type 2 , Insulin Resistance , MicroRNAs , Humans , Diabetes Mellitus, Type 2/genetics , Pilot Projects , MicroRNAs/metabolism , Gene Expression Profiling
4.
Int J Mol Sci ; 24(24)2023 Dec 17.
Article in English | MEDLINE | ID: mdl-38139401

ABSTRACT

Pregnancy loss is the most frequent complication of pregnancy; it is devastating for affected families and poses a significant challenge for the health care system. Genetic factors are known to play an important role in the etiology of pregnancy loss; however, despite advances in diagnostics, the causes remain unexplained in more than 30% of cases. In this review, we aggregated the results of the decade-long studies into the genetic risk factors of pregnancy loss (including miscarriage, termination for fetal abnormality, and recurrent pregnancy loss) in euploid pregnancies, focusing on the spectrum of point mutations associated with these conditions. We reviewed the evolution of molecular genetics methods used for the genetic research into causes of pregnancy loss, and collected information about 270 individual genetic variants in 196 unique genes reported as a genetic cause of pregnancy loss. Among these, variants in 18 genes have been reported by multiple studies, and two or more variants were reported as causing pregnancy loss for 57 genes. Further analysis of the properties of all known pregnancy loss genes showed that they correspond to broadly expressed, highly evolutionarily conserved genes involved in crucial cell differentiation and developmental processes and related signaling pathways. Given the features of known genes, we made an effort to construct a list of candidate genes, variants in which may be expected to contribute to pregnancy loss. We believe that our results may be useful for prediction of pregnancy loss risk in couples, as well as for further investigation and revealing genetic etiology of pregnancy loss.


Subject(s)
Abortion, Habitual , Point Mutation , Pregnancy , Female , Humans , Abortion, Habitual/genetics
5.
Genes (Basel) ; 14(11)2023 Nov 18.
Article in English | MEDLINE | ID: mdl-38003043

ABSTRACT

Phenotypic heterogeneity is a phenomenon in which distinct phenotypes can develop in individuals bearing pathogenic variants in the same gene. Genetic factors, gene interactions, and environmental factors are usually considered the key mechanisms of this phenomenon. Phenotypic heterogeneity may impact the prognosis of the disease severity and symptoms. In our work, we used publicly available data on the association between genetic variants and Mendelian disease to investigate the genetic factors (such as the intragenic localization and type of a variant) driving the heterogeneity of gene-disease relationships. First, we showed that genes linked to multiple rare diseases (GMDs) are more constrained and tend to encode more transcripts with high levels of expression across tissues. Next, we assessed the role of variant localization and variant types in specifying the exact phenotype for GMD variants. We discovered that none of these factors is sufficient to explain the phenomenon of such heterogeneous gene-disease relationships. In total, we identified only 38 genes with a weak trend towards significant differences in variant localization and 30 genes with nominally significant differences in variant type for the two associated disorders. Remarkably, four of these genes showed significant differences in both tests. At the same time, our analysis suggests that variant localization and type are more important for genes linked to autosomal dominant disease. Taken together, our results emphasize the gene-level factors dissecting distinct Mendelian diseases linked to one common gene based on open-access genetic data and highlight the importance of exploring other factors that contributed to phenotypic heterogeneity.


Subject(s)
Rare Diseases , Humans , Rare Diseases/genetics , Phenotype , Prognosis
6.
Genes (Basel) ; 14(3)2023 Mar 20.
Article in English | MEDLINE | ID: mdl-36981026

ABSTRACT

Single-cell RNA sequencing (scRNA-seq) is a method that focuses on the analysis of gene expression profiles in individual cells. This method has been successfully applied to answer the challenging questions of the pathogenesis of multifactorial diseases and has opened up new possibilities in the prognosis and prevention of reproductive diseases. In this article, we have reviewed the application of scRNA-seq to the analysis of the various cell types and their gene expression changes in normal pregnancy and pregnancy complications. The main principle, advantages, and limitations of single-cell technologies and data analysis methods are described. We discuss the possibilities of using the scRNA-seq method for solving the fundamental and applied tasks related to various pregnancy-associated disorders. Finally, we provide an overview of the scRNA-seq findings for the common pregnancy-associated conditions, such as hyperglycemia in pregnancy, recurrent pregnancy loss, preterm labor, polycystic ovary syndrome, and pre-eclampsia.


Subject(s)
Gene Expression Profiling , Pre-Eclampsia , Pregnancy , Female , Infant, Newborn , Humans , Gene Expression Profiling/methods , Single-Cell Analysis/methods , Transcriptome , Sequence Analysis, RNA/methods
7.
Genes (Basel) ; 15(1)2023 Dec 27.
Article in English | MEDLINE | ID: mdl-38254935

ABSTRACT

A male factor, commonly associated with poor semen quality, is revealed in about 50% of infertile couples. CFTR gene (Cystic Fibrosis Transmembrane Conductance Regulator) variants are one of the common genetic causes of azoospermia-related male infertility. Notably, the spectrum and frequency of pathogenic CFTR variants vary between populations and geographical regions. In this work, we made an attempt to evaluate the allele frequency (AF) of 12 common CFTR variants in infertile Russian men and healthy individuals from different districts of Russia. Because of the limited number of population-based studies on Russian individuals, we characterized the population AFs based on data from the Registry of Russian cystic fibrosis (CF) patients. In addition to the CF patient registry, we estimated the local frequencies of the same set of variants based on the results of genotyping of CF patients in local biocollections (from St. Petersburg and Yugra regions). AFs of common CFTR variants calculated based on registry and biocollection data showed good concordance with directly measured population AFs. The estimated region-specific frequencies of CFTR variants allowed us to uncover statistically significant regional differences in the frequencies of the F508del (c.1521_1523del; p.Phe508del) and CFTRdele2,3(21kb) (c.54-5940_273+10250del21kb; p.Ser18ArgfsX) variants. The data from population-based studies confirmed previous observations that F508del, CFTRdele2,3(21kb), and L138ins (c.413_415dup; p.Leu138dup) variants are the most abundant among infertile patients, and their frequencies are significantly lower in healthy individuals and should be taken into account during genetic monitoring of the reproductive health of Russian individuals.
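As a rough illustration of the allele frequency (AF) estimates discussed in this abstract, the sketch below computes the alternate-allele frequency from diploid genotype counts. The function name and the example counts are hypothetical and are not taken from the study.

```python
def allele_frequency(hom_ref: int, het: int, hom_alt: int) -> float:
    """Alternate-allele frequency from diploid genotype counts.

    Each individual carries two alleles: heterozygotes contribute one
    alternate allele, alternate homozygotes contribute two.
    """
    total_alleles = 2 * (hom_ref + het + hom_alt)
    if total_alleles == 0:
        raise ValueError("no genotyped individuals")
    return (het + 2 * hom_alt) / total_alleles


# Hypothetical counts for illustration only (not data from the study).
example_af = allele_frequency(hom_ref=980, het=19, hom_alt=1)
```

Comparing such frequencies between regions or between patients and controls would additionally require a statistical test, which this sketch does not attempt.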


Subject(s)
Cystic Fibrosis , Infertility, Male , Humans , Male , Cystic Fibrosis/genetics , Cystic Fibrosis Transmembrane Conductance Regulator/genetics , Gene Frequency , Infertility, Male/genetics , Semen Analysis , Female
8.
Biology (Basel) ; 13(1)2023 Dec 23.
Article in English | MEDLINE | ID: mdl-38248441

ABSTRACT

Genome-wide association studies (GWAS) have proven to be a powerful tool for the identification of genetic susceptibility loci affecting human complex traits. In addition to pinpointing individual genes involved in a particular trait, GWAS results can be used to discover relevant biological processes for these traits. The development of new tools for extracting such information from GWAS results requires large-scale datasets with known biological ground truth. Simulation of GWAS results is a powerful method that may provide such datasets and facilitate the development of new methods. In this work, we developed bioGWAS, a simple and flexible pipeline for the simulation of genotypes, phenotypes, and GWAS summary statistics. Unlike existing methods, bioGWAS can be used to generate GWAS results for simulated quantitative and binary traits with a predefined set of causal genetic variants and/or molecular pathways. We demonstrate that the proposed method can recapitulate complete GWAS datasets using a set of reported genome-wide associations. We also used our method to benchmark several tools for gene set enrichment analysis for GWAS data. Taken together, our results suggest that bioGWAS provides an important set of functionalities that would aid the development of new methods for downstream processing of GWAS results.

9.
J Pers Med ; 12(12)2022 Dec 09.
Article in English | MEDLINE | ID: mdl-36556260

ABSTRACT

In recent years, great advances have been made in the field of collection, storage, and analysis of biological samples. Large collections of samples, biobanks, have been established in many countries. Biobanks typically collect large amounts of biological samples and associated clinical information; the largest collections include over a million samples. In this review, we summarize the main directions in which biobanks aid medical genetics and genomic research, from providing reference allele frequency information to allowing large-scale cross-ancestry meta-analyses. The largest biobanks greatly vary in the size of the collection, and the amount of available phenotype and genotype data. Nevertheless, all of them are extensively used in genomics, providing a rich resource for genome-wide association analysis, genetic epidemiology, and statistical research into the structure, function, and evolution of the human genome. Recently, multiple research efforts were based on trans-biobank data integration, which increases sample size and allows for the identification of robust genetic associations. We provide prominent examples of such data integration and discuss important caveats which have to be taken into account in trans-biobank research.

10.
Biology (Basel) ; 11(12)2022 Nov 22.
Article in English | MEDLINE | ID: mdl-36552198

ABSTRACT

Yeast is a convenient model for studying protein aggregation as it is known to propagate amyloid prions. [PSI+] is the prion form of the release factor eRF3 (Sup35). Aggregated Sup35 causes defects in termination of translation, which results in nonsense suppression in strains carrying premature stop codons. N-terminal and middle (M) domains of Sup35 are necessary and sufficient for maintaining [PSI+] in cells while preserving the prion strain's properties. For this reason, Sup35NM fused to fluorescent proteins is often used for [PSI+] detection and investigation. However, we found that in such chimeric constructs, not all fluorescent proteins allow the reliable detection of Sup35 aggregates. Particularly, transient overproduction of Sup35NM-mCherry resulted in a diffuse fluorescent pattern in the [PSI+] cells, while no loss of prions and no effect on the Sup35NM prion properties could be observed. This effect was reproduced in various unrelated strain backgrounds and prion variants. In contrast, Sup35NM fused to another red fluorescent protein, TagRFP-T, allowed the detection of [PSI+] aggregates. Analysis of protein lysates showed that Sup35NM-mCherry is actively degraded in the cell. This degradation was not caused by vacuolar proteases and the ubiquitin-proteasomal system implicated in the Sup35 processing. Even though the intensity of this proteolysis was higher than that of Sup35NM-GFP, it was roughly the same as in the case of Sup35NM-TagRFP-T. Thus, it is possible that, in contrast to TagRFP-T, degradation products of Sup35NM-mCherry still preserve their fluorescent properties while losing the ability to decorate pre-existing Sup35 aggregates. This results in diffuse fluorescence despite the presence of the prion aggregates in the cell. Thus, tagging with fluorescent proteins should be used with caution, as such proteolysis may increase the rate of false-negative results when detecting prion-bearing cells.

11.
Genes (Basel) ; 13(12)2022 Nov 30.
Article in English | MEDLINE | ID: mdl-36553520

ABSTRACT

Complications endangering mother or fetus affect around one in seven pregnant women. Investigation of the genetic susceptibility to such diseases is of high importance for better understanding of the disease biology as well as for prediction of individual risk. In this study, we collected and analyzed GWAS summary statistics from the FinnGen cohort and UK Biobank for 24 pregnancy complications. In FinnGen, we identified 11 loci associated with pregnancy hypertension, excessive vomiting, and gestational diabetes. When UK Biobank and FinnGen data were combined, we discovered six loci reaching genome-wide significance in the meta-analysis. These include rs35954793 in FGF5 (p = 6.1 × 10⁻⁹), rs10882398 in PLCE1 (p = 8.9 × 10⁻⁹), and rs167479 in RGL3 (p = 5.2 × 10⁻⁹) for pregnancy hypertension, rs10830963 in MTNR1B (p = 4.5 × 10⁻⁴¹) and rs36090025 in TCF7L2 (p = 3.4 × 10⁻¹⁵) for gestational diabetes, and rs2963457 in the EBF1 locus (p = 6.5 × 10⁻⁹) for preterm birth. In addition to the identified genome-wide associations, we also replicated 14 out of 40 previously reported GWAS markers for pregnancy complications, including four more preeclampsia-related variants. Finally, annotation of the GWAS results identified a causal relationship between gene expression in the cervix and gestational hypertension, as well as both known and previously uncharacterized genetic correlations between pregnancy complications and other traits. These results suggest new prospects for research into the etiology and pathogenesis of pregnancy complications, as well as early risk prediction for these disorders.
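The six p-values quoted in this abstract can be checked against a genome-wide significance cutoff. The sketch below assumes the conventional threshold of 5 × 10⁻⁸, which the abstract does not state explicitly; the function and variable names are illustrative.

```python
# Conventional genome-wide significance cutoff; an assumption, since the
# abstract does not state the threshold the authors applied.
GENOME_WIDE_THRESHOLD = 5e-8

# Lead variants and p-values as quoted in the abstract.
reported = {
    "rs35954793 (FGF5)": 6.1e-9,
    "rs10882398 (PLCE1)": 8.9e-9,
    "rs167479 (RGL3)": 5.2e-9,
    "rs10830963 (MTNR1B)": 4.5e-41,
    "rs36090025 (TCF7L2)": 3.4e-15,
    "rs2963457 (EBF1)": 6.5e-9,
}


def genome_wide_significant(p_values, threshold=GENOME_WIDE_THRESHOLD):
    """Return variant labels with p below the threshold, strongest first."""
    hits = [name for name, p in p_values.items() if p < threshold]
    return sorted(hits, key=p_values.get)


significant = genome_wide_significant(reported)
```

Under this assumed cutoff all six quoted loci pass, with the MTNR1B variant showing the strongest signal.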


Subject(s)
Diabetes, Gestational , Hypertension , Pregnancy Complications , Premature Birth , Infant, Newborn , Humans , Female , Pregnancy , Genome-Wide Association Study , Diabetes, Gestational/genetics , Biological Specimen Banks , Pregnancy Complications/genetics , United Kingdom
12.
Biochemistry (Mosc) ; 87(5): 450-463, 2022 May.
Article in English | MEDLINE | ID: mdl-35790379

ABSTRACT

Amyloids are protein aggregates with the cross-β structure. The interest in amyloids is explained, on the one hand, by their role in the development of socially significant human neurodegenerative diseases, and on the other hand, by the discovery of functional amyloids, whose formation is an integral part of cellular processes. To date, more than a hundred proteins with the amyloid or amyloid-like properties have been identified. Studying the structure of amyloid aggregates has revealed a wide variety of protein conformations. In the review, we discuss the diversity of protein folds in the amyloid-like aggregates and the characteristic features of amyloid aggregates that determine their unusual properties, including stability and interaction with amyloid-specific dyes. The review also describes the diversity of amyloid aggregates and its significance for living organisms.


Subject(s)
Amyloidogenic Proteins , Amyloidosis , Amyloid/metabolism , Amyloidosis/genetics , Humans , Polymorphism, Genetic , Protein Conformation
13.
Genes (Basel) ; 13(8)2022 Jul 23.
Article in English | MEDLINE | ID: mdl-35893047

ABSTRACT

Metformin is an oral hypoglycemic agent widely used in clinical practice for treatment of patients with type 2 diabetes mellitus (T2DM). Wide interindividual variability in response to metformin therapy has been shown, and recently the impact of several genetic variants was reported. To assess the independent and combined effect of the genetic polymorphism on glycemic response to metformin, we performed an association analysis of the variants in ATM, SLC22A1, SLC47A1, and SLC2A2 genes with metformin response in 299 patients with T2DM. Likewise, the distribution of allele and genotype frequencies of the studied gene variants was analyzed in an extended group of patients with T2DM (n = 464) and a population group (n = 129). According to our results, one variant, rs12208357 in the SLC22A1 gene, had a significant impact on response to metformin in T2DM patients. Carriers of TT genotype and T allele had a lower response to metformin compared to carriers of CC/CT genotypes and C allele (p-value = 0.0246, p-value = 0.0059, respectively). To identify the parameters that had the greatest importance for the prediction of the therapy response to metformin, we next built a set of machine learning models, based on the various combinations of genetic and phenotypic characteristics. The model based on a set of four parameters, including gender, rs12208357 genotype, familial T2DM background, and waist-hip ratio (WHR) showed the highest prediction accuracy for the response to metformin therapy in patients with T2DM (AUC = 0.62 in cross-validation). Further pharmacogenetic studies may aid in the discovery of the fundamental mechanisms of type 2 diabetes, the identification of new drug targets, and finally, it could advance the development of personalized treatment.
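For readers unfamiliar with the AUC figure quoted in this abstract, the following sketch shows one standard way to compute the area under the ROC curve: the fraction of responder/non-responder pairs that a model ranks correctly, with ties counted as half. The labels and scores are hypothetical, and the abstract does not describe how the authors computed their value.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive outranks a negative.

    labels: iterable of 0/1 class labels; scores: model outputs.
    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical predictions for four patients (not data from the study).
example_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which puts the reported cross-validated value of 0.62 in context.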


Subject(s)
Diabetes Mellitus, Type 2 , Metformin , Blood Glucose/genetics , Diabetes Mellitus, Type 2/drug therapy , Diabetes Mellitus, Type 2/genetics , Humans , Hypoglycemic Agents/therapeutic use , Metformin/therapeutic use , Polymorphism, Single Nucleotide
14.
Genes (Basel) ; 13(7)2022 Jun 30.
Article in English | MEDLINE | ID: mdl-35885959

ABSTRACT

Type 2 diabetes (T2D) is a common chronic disease whose etiology is known to have a strong genetic component. Standard genetic approaches, although allowing for the detection of a number of gene variants associated with the disease as well as differentially expressed genes, cannot fully explain the hereditary factor in T2D. The explosive growth in the genomic sequencing technologies over the last decades provided an exceptional impetus for transcriptomic studies and new approaches to gene expression measurement, such as RNA-sequencing (RNA-seq) and single-cell technologies. The transcriptomic analysis has the potential to find new biomarkers to identify risk groups for developing T2D and its microvascular and macrovascular complications, which will significantly affect the strategies for early diagnosis, treatment, and preventing the development of complications. In this article, we focused on transcriptomic studies conducted using expression arrays, RNA-seq, and single-cell sequencing to highlight recent findings related to T2D and challenges associated with transcriptome experiments.


Subject(s)
Diabetes Mellitus, Type 2 , Transcriptome , Biomarkers , Diabetes Mellitus, Type 2/complications , Diabetes Mellitus, Type 2/genetics , Gene Expression Profiling , Humans , Sequence Analysis, RNA , Transcriptome/genetics
15.
Front Genet ; 13: 846101, 2022.
Article in English | MEDLINE | ID: mdl-35664296

ABSTRACT

Introduction: Floating Harbor syndrome (FHS) is an extremely rare disorder, with slightly more than a hundred cases reported worldwide. FHS is caused by heterozygous mutations in the SRCAP gene; however, little is known about the pathogenesis of FHS or the effectiveness of its treatment. Methods: Whole-exome sequencing (WES) was performed for the definitive molecular diagnosis of the disease. Identified variants were validated using Sanger sequencing. In addition, a systematic review of the literature and of public data on genetic variation in SRCAP and the effects of growth hormone (GH) treatment was conducted. Results: We herein report the first case of FHS in the Russian Federation. The male proband presented with most of the typical phenotypic features of FHS, including short stature, skeletal and facial features, delayed growth and bone age, high pitched voice, and intellectual impairment. The proband also had partial growth hormone deficiency. We report the history of treatment of the proband with GH, which resulted in modest improvement in growth prior to puberty. WES revealed a pathogenic c.7466C>G (p.Ser2489*) mutation in the last exon of the FHS-linked SRCAP gene. A systematic literature review and analysis of available genetic variation datasets highlighted an unusual distribution of pathogenic variants in SRCAP and confirmed the lack of pathogenicity for variants outside of exons 33 and 34. Finally, we suggested a new model of FHS pathogenesis which provides a possible basis for the dominant negative nature of FHS-causing mutations and explains limited effects of GH treatment in FHS. Conclusion: Our findings expand the number of reported FHS cases and provide new insights into disease genetics and the efficiency of GH therapy for FHS patients.

16.
Genes (Basel) ; 13(4)2022 Mar 24.
Article in English | MEDLINE | ID: mdl-35456380

ABSTRACT

Although high altitude training has become increasingly popular among endurance athletes, the molecular and cellular bases of this adaptation remain poorly understood. We aimed to define the underlying physiological changes and screen for potential biomarkers of adaptation using transcriptional profiling of whole blood. Seven elite female speed skaters were profiled on the 18th day of high-altitude adaptation. Whole blood RNA-seq before and after an intense 1 h skating bout was used to measure gene expression changes associated with exercise. In order to identify the genes specifically regulated at high altitudes, we have leveraged the data from eight previously published microarray datasets studying blood expression changes after exercise at sea level. Using cell type-specific signatures, we were able to deconvolute changes of cell type abundance from individual gene expression changes. Among these were PHOSPHO1, with a known role in erythropoiesis, and MARC1 with a role in endogenous NO metabolism. We find that platelet and erythrocyte counts uniquely respond to altitude exercise, while changes in neutrophils represent a more generic marker of intense exercise. Publicly available data from both single cell atlases and exercise-related blood profiling dramatically increase the value of whole blood RNA-seq for the dynamic evaluation of physiological changes in an athlete's body.


Subject(s)
Altitude , Exercise , Acclimatization , Athletes , Exercise/physiology , Female , Humans , Sequence Analysis, RNA
17.
Genes (Basel) ; 13(3)2022 Mar 17.
Article in English | MEDLINE | ID: mdl-35328087

ABSTRACT

The COVID-19 pandemic has drawn the attention of many researchers to the interaction between pathogen and host genomes. Over the last two years, numerous studies have been conducted to identify the genetic risk factors that predict COVID-19 severity and outcome. However, such an analysis might be complicated in cohorts of limited size and/or in case of limited breadth of genome coverage. In this work, we tried to circumvent these challenges by searching for candidate genes and genetic variants associated with a variety of quantitative and binary traits in a cohort of 840 COVID-19 patients from Russia. While we found no gene- or pathway-level associations with the disease severity and outcome, we discovered eleven independent candidate loci associated with quantitative traits in COVID-19 patients. Out of these, the most significant associations correspond to rs1651553 in MYH14 (p = 1.4 × 10⁻⁷), rs11243705 in SETX (p = 8.2 × 10⁻⁶), and rs16885 in ATXN1 (p = 1.3 × 10⁻⁵). One of the identified variants, rs33985936 in SCN11A, was successfully replicated in an independent study, and three of the variants were found to be associated with blood-related quantitative traits according to the UK Biobank data (rs33985936 in SCN11A, rs16885 in ATXN1, and rs4747194 in CDH23). Moreover, we show that a risk score based on these variants can predict the severity and outcome of hospitalization in our cohort of patients. Given these findings, we believe that our work may serve as a proof-of-concept study demonstrating the utility of quantitative traits and extensive phenotyping for identification of genetic risk factors of severe COVID-19.


Subject(s)
COVID-19 , COVID-19/genetics , COVID-19/pathology , Cohort Studies , Genome-Wide Association Study , Humans , Pandemics , Patient Acuity , Risk Factors , Russia
18.
BMC Genomics ; 23(1): 155, 2022 Feb 22.
Article in English | MEDLINE | ID: mdl-35193511

ABSTRACT

BACKGROUND: Accurate variant detection in the coding regions of the human genome is a key requirement for molecular diagnostics of Mendelian disorders. Efficiency of variant discovery from next-generation sequencing (NGS) data depends on multiple factors, including reproducible coverage biases of NGS methods and the performance of read alignment and variant calling software. Although variant caller benchmarks are published constantly, no previous publications have leveraged the full extent of available gold standard whole-genome (WGS) and whole-exome (WES) sequencing datasets. RESULTS: In this work, we systematically evaluated the performance of 4 popular short read aligners (Bowtie2, BWA, Isaac, and Novoalign) and 9 novel and well-established variant calling and filtering methods (Clair3, DeepVariant, Octopus, GATK, FreeBayes, and Strelka2) using a set of 14 "gold standard" WES and WGS datasets available from the Genome in a Bottle (GIAB) consortium. Additionally, we have indirectly evaluated each pipeline's performance using a set of 6 non-GIAB samples of African and Russian ethnicity. In our benchmark, Bowtie2 performed significantly worse than other aligners, suggesting it should not be used for medical variant calling. When other aligners were considered, the accuracy of variant discovery mostly depended on the variant caller and not the read aligner. Among the tested variant callers, DeepVariant consistently showed the best performance and the highest robustness. Other actively developed tools, such as Clair3, Octopus, and Strelka2, also performed well, although their efficiency had greater dependence on the quality and type of the input data. We have also compared the consistency of variant calls in GIAB and non-GIAB samples. With a few important caveats, the best-performing tools have shown little evidence of overfitting.
CONCLUSIONS: The results show surprisingly large differences in the performance of cutting-edge tools even in high confidence regions of the coding genome. This highlights the importance of regular benchmarking of quickly evolving tools and pipelines. We also discuss the need for a more diverse set of gold standard genomes that would include samples of African, Hispanic, or mixed ancestry. Additionally, there is also a need for better variant caller assessment in the repetitive regions of the coding genome.
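Variant caller benchmarks against a gold-standard call set are commonly summarized with precision, recall, and the F1 score. The abstract does not state which metrics the authors used, so the sketch below is only an illustration of that common approach; the function name and the counts are hypothetical.

```python
def benchmark_metrics(tp: int, fp: int, fn: int):
    """Precision, recall, and F1 from true/false positive and false negative counts.

    tp: calls matching the gold standard
    fp: calls absent from the gold standard
    fn: gold-standard variants that were not called
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical counts for one aligner/caller pipeline (not study data).
precision, recall, f1 = benchmark_metrics(tp=990, fp=10, fn=10)
```

Reporting F1 alongside precision and recall makes it harder for a pipeline to look good by trading missed variants for fewer false calls, or the reverse.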


Subject(s)
Benchmarking , Polymorphism, Single Nucleotide , Exome , High-Throughput Nucleotide Sequencing/methods , Humans , Software
19.
J Fungi (Basel) ; 8(2)2022 Jan 27.
Article in English | MEDLINE | ID: mdl-35205876

ABSTRACT

Baker's yeast Saccharomyces cerevisiae is an important model organism that is applied to study various aspects of eukaryotic cell biology. Prions in yeast are self-perpetuating heritable protein aggregates that can be leveraged to study the interaction between the protein quality control (PQC) machinery and misfolded proteins. More than ten prions have been identified in yeast, of which the most studied ones include [PSI+], [URE3], and [PIN+]. While all of the major molecular chaperones have been implicated in propagation of yeast prions, many of these chaperones differentially impact propagation of different prions and/or prion variants. In this review, we summarize the current understanding of the life cycle of yeast prions and systematically review the effects of different chaperone proteins on their propagation. Our analysis clearly shows that Hsp40 proteins play a central role in prion propagation by determining the fate of prion seeds and other amyloids. Moreover, direct prion-chaperone interaction seems to be critically important for proper recruitment of all PQC components to the aggregate. Recent results also suggest that the cell asymmetry apparatus, cytoskeleton, and cell signaling all contribute to the complex network of prion interaction with the yeast cell.

20.
Genes (Basel) ; 12(12)2021 Dec 19.
Article in English | MEDLINE | ID: mdl-34946968

ABSTRACT

Protein synthesis (translation) is one of the fundamental processes occurring in the cells of living organisms. Translation can be divided into three key steps: initiation, elongation, and termination. In the yeast Saccharomyces cerevisiae, there are two translation termination factors, eRF1 and eRF3. These factors are encoded by the SUP45 and SUP35 genes, which are essential; deletion of any of them leads to the death of yeast cells. However, viable strains with nonsense mutations in both the SUP35 and SUP45 genes were previously obtained in several groups. The survival of such mutants clearly involves feedback control of premature stop codon readthrough; however, the exact molecular basis of such feedback control remains unclear. To investigate the genetic factors supporting the viability of these SUP35 and SUP45 nonsense mutants, we performed whole-genome sequencing of strains carrying mutant sup35-n and sup45-n alleles; while no common SNPs or indels were found in these genomes, we discovered a systematic increase in the copy number of the plasmids carrying mutant sup35-n and sup45-n alleles. We used the qPCR method which confirmed the differences in the relative number of SUP35 and SUP45 gene copies between strains carrying wild-type or mutant alleles of SUP35 and SUP45 genes. Moreover, we compared the number of copies of the SUP35 and SUP45 genes in strains carrying different nonsense mutant variants of these genes as a single chromosomal copy. qPCR results indicate that the number of mutant gene copies is increased compared to the wild-type control. In the case of several sup45-n alleles, this was due to a disomy of the entire chromosome II, while for the sup35-218 mutation we observed a local duplication of a segment of chromosome IV containing the SUP35 gene. Taken together, our results indicate that gene amplification is a common mechanism of adaptation to nonsense mutations in release factor genes in yeast.
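Relative gene copy number from qPCR is often estimated with the comparative Ct (2^-ΔΔCt) calculation. The abstract does not say which quantification method the authors used, so the sketch below is an assumption-laden illustration: it presumes roughly 100% amplification efficiency, and the function name and Ct values are hypothetical.

```python
def relative_copy_number(ct_target_sample: float, ct_ref_sample: float,
                         ct_target_control: float, ct_ref_control: float) -> float:
    """Comparative Ct estimate of target gene copies relative to a control strain.

    Assumes each PCR cycle doubles the product (about 100% efficiency)
    and that the reference gene has the same copy number in both strains.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    delta_delta_ct = delta_sample - delta_control
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values (not data from the study): the target gene in the
# mutant strain crosses the threshold one cycle earlier than in the control.
fold = relative_copy_number(
    ct_target_sample=20.0, ct_ref_sample=22.0,
    ct_target_control=21.0, ct_ref_control=22.0,
)
```

With these illustrative values the estimate is a twofold increase, the kind of signal that a chromosome disomy or a local duplication of a single-copy gene would be expected to produce.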


Subject(s)
Gene Amplification , Peptide Termination Factors/genetics , Saccharomyces cerevisiae Proteins/genetics , Saccharomyces cerevisiae/growth & development , Adaptation, Physiological , Chromosomes, Fungal/genetics , Codon, Nonsense , Saccharomyces cerevisiae/genetics , Whole Genome Sequencing